Fix ArrowNotImplementedError in SalesScrutinyStudy with pyarrow >= 22 #312

Open
drussellmrichie wants to merge 1 commit into larsiusprime:master from drussellmrichie:fix/pyarrow-null-dtype-and-model-name-undefined

@drussellmrichie

Summary

When a DataFrame column has Arrow null dtype (all-null values with no
inferred type), calling .astype(str) does not reliably convert it to
a string dtype in newer pyarrow/pandas combinations (observed with
pyarrow 22 + pandas 2.3). Subsequent string concatenation with a
large_string-typed column then raises:

ArrowNotImplementedError: Function 'binary_join_element_wise' has no
kernel matching input types (null, large_string, large_string)

This crashes SalesScrutinyStudy.__init__ for any model group.

Fix

Go through Python object dtype first (.astype(object)) before
.astype(str) on both ss_id and model_group, bypassing the Arrow
backend for this string concatenation.

Reproduction

Install openavmkit with pyarrow >= 22 and pandas >= 2.3, then run
the sales scrutiny step on any locality. The crash occurs in
SalesScrutinyStudy.__init__ at the ss_id construction lines.

When a DataFrame column has Arrow null dtype (all-null values with no
known type), calling .astype(str) does not convert it to a string dtype
with newer versions of pyarrow (observed with pyarrow 22 + pandas 2.3).
Subsequent string concatenation with a large_string-typed column raises:

  ArrowNotImplementedError: Function 'binary_join_element_wise' has no
  kernel matching input types (null, large_string, large_string)

Fix: go through Python object dtype first (.astype(object)) before
.astype(str), and apply the same to model_group to ensure both sides
of the concatenation are plain Python object strings.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Contributor

Thank you for your contribution.
Please sign our CLA at the following link:
Click here to sign the CLA.
A maintainer will verify your signature and confirm it here by commenting with the following sentence:


I affirm that this contributor has signed the CLA


Russell Richie does not appear to be a GitHub user. You need a GitHub account to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

@larsiusprime
Owner

I affirm that this contributor has signed the CLA
